Once you have run sourcetracker2 to estimate the mixing proportions of your sources in your sink samples, you'll likely want to visualize the results to help answer your scientific question. Since many people use IPython notebooks for inline analysis, I'll demonstrate some quick ways to visualize the results in a notebook using pandas.
The results files are simple tab-delimited text documents, so importing them into R, Excel, MATLAB, or your favorite visualization package is easy.
In [2]:
# Import packages of interest
# You might need to install these in your local environment,
# which is easily done with: pip install <package>
import matplotlib.pyplot as plt
import pandas as pd
%matplotlib inline
In [4]:
# Move into our tiny test directory
cd ../data/tiny-test/
In [5]:
# read in the mixing proportions result file to a pandas DataFrame
# sep='\t' to denote tab delimited file
# index_col=0 to denote that pandas should set the index values as the SampleIDs
results = pd.read_csv('mixing_proportions/mixing_proportions.txt', sep='\t', index_col=0)
results
Out[5]:
The above table shows us that we had 5 sink samples and 4 possible source environments (including the Unknown).
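A quick sanity check on a table like this is to confirm its shape and that each sink's proportions sum to about 1 when the Unknown column is included. The sketch below is only an illustration: the sample IDs and values are made up (they are not the tiny-test results), and it assumes rows are sinks and columns are sources.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical tab-delimited table standing in for mixing_proportions.txt;
# the sample IDs and values are made up for illustration.
example_text = (
    "SampleID\tdrainwater\tseawater\tUnknown\n"
    "sink_a\t0.20\t0.50\t0.30\n"
    "sink_b\t0.10\t0.15\t0.75\n"
)
example = pd.read_csv(io.StringIO(example_text), sep="\t", index_col=0)

n_sinks, n_sources = example.shape  # rows are sinks, columns are sources
row_sums = example.sum(axis=1)      # one total per sink sample

print(n_sinks, "sinks,", n_sources, "sources")
print("rows sum to ~1:", np.allclose(row_sums, 1.0))
```

The same two checks can be run on the real `results` DataFrame by replacing `example` with `results`.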
The pandas package has built-in plotting features (built on matplotlib) that allow for quick visualization.
In [7]:
results.plot(kind='bar', grid=True, figsize=(8,6), ylim=(0,0.5))
Out[7]:
To plot the standard deviations of the draws used to estimate the mixing proportions, read mixing_proportions_stds.txt into a second pandas DataFrame and pass that DataFrame to the plotting function with the yerr argument.
In [11]:
stdevs = pd.read_csv('mixing_proportions/mixing_proportions_stds.txt', sep='\t', index_col=0)
In [9]:
stdevs
Out[9]:
In [12]:
# Plot mixing proportions with yerr
results.plot(kind='bar', grid=True, figsize=(8,6), ylim=(0,0.5), yerr=stdevs)
Out[12]:
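When yerr is a DataFrame, pandas matches it to the data by label, so it helps to confirm that the two tables share the same row and column labels before plotting. The sketch below does that with made-up numbers (not sourcetracker2 output); the names `props` and `errs` are hypothetical stand-ins for `results` and `stdevs`, and the Agg backend is selected only so the sketch runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs outside a notebook
import pandas as pd

# Made-up values for illustration only
props = pd.DataFrame(
    {"drainwater": [0.20, 0.10], "seawater": [0.50, 0.15]},
    index=["sink_a", "sink_b"],
)
errs = pd.DataFrame(
    {"drainwater": [0.02, 0.01], "seawater": [0.05, 0.03]},
    index=["sink_a", "sink_b"],
)

# The error table should carry the same row and column labels as the data
labels_match = props.index.equals(errs.index) and props.columns.equals(errs.columns)
print("labels match:", labels_match)

# Grouped bar chart with one error bar per bar
ax = props.plot(kind="bar", yerr=errs, grid=True, figsize=(8, 6))
ax.set_ylabel("Mixing proportion")
```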
Pandas also lets you choose which columns to plot:
In [19]:
# Plotting only the drainwater source
results['drainwater'].plot(kind='bar', grid=True, figsize=(8,6), ylim=(0,0.5), yerr=stdevs, color='pink', title='Mixing Proportions')
Out[19]:
Or, if the user wants to use subplots, pandas also allows for that:
In [20]:
# Plot drainwater and seawater in separate subplots, with error bars
results[['drainwater', 'seawater']].plot(subplots=True, kind='bar', grid=True, figsize=(8,6), ylim=(0,0.5), yerr=stdevs)
Out[20]:
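With subplots=True, the plot call returns an array of matplotlib Axes, one per column, so each panel can be adjusted separately. This is a minimal sketch with made-up values; `props` is a hypothetical stand-in for the `results` DataFrame, and the Agg backend is used only so it runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs outside a notebook
import pandas as pd

# Made-up values for illustration only
props = pd.DataFrame(
    {"drainwater": [0.20, 0.10], "seawater": [0.50, 0.15]},
    index=["sink_a", "sink_b"],
)

# One Axes object is returned per plotted column
axes = props.plot(subplots=True, kind="bar", grid=True, figsize=(8, 6), ylim=(0, 1))

# Label every panel's y axis
for ax in axes:
    ax.set_ylabel("Proportion")

print(len(axes), "subplots")
```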
Matplotlib options
Because pandas uses matplotlib in the background, you can use the matplotlib API to alter your graphs:
In [30]:
# set figure in matplotlib
fig, ax = plt.subplots(1,1)
# plot the dataframe
# use 'ax=ax' to draw the plot on the matplotlib axes object created above
results.plot(kind='line', lw=5, ax=ax)
# set options
ax.set_ylabel('Proportions')
ax.set_xlabel('Sample')
ax.set_title('Mixing Proportions')
# move legend
ax.legend(bbox_to_anchor=(1.4, 0.4))
Out[30]:
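Because the legend above is anchored outside the axes, it can be cut off when the figure is saved to a file. One common remedy is to pass bbox_inches='tight' to savefig. The sketch below uses made-up values and writes to an in-memory buffer rather than to disk; `props` is a hypothetical stand-in for `results`, and the Agg backend is used only so it runs outside a notebook.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs outside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up values for illustration only
props = pd.DataFrame(
    {"drainwater": [0.20, 0.10], "seawater": [0.50, 0.15]},
    index=["sink_a", "sink_b"],
)

fig, ax = plt.subplots(1, 1)
props.plot(kind="line", lw=5, ax=ax)
ax.legend(bbox_to_anchor=(1.4, 0.4))  # legend sits outside the axes

# bbox_inches='tight' expands the saved area to include the legend
buffer = io.BytesIO()
fig.savefig(buffer, format="png", bbox_inches="tight")
png_bytes = buffer.getvalue()
print(len(png_bytes) > 0)
```

To write a file instead, pass a filename of your choice to savefig in place of the buffer.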